Abstract: We consider the problem of securing a multicast
network against a wiretapper that can intercept the packets on
a limited number of arbitrary network links of his choice. We
assume that the network implements network coding techniques
to simultaneously deliver all the packets available at the source
to all the destinations. We show how this problem can be looked
at as a network generalization of the Ozarow-Wyner Wiretap
Channel of type II. In particular, we show that network security
can be achieved by using the Ozarow-Wyner approach of coset
coding at the source on top of the implemented network code.
This way, we quickly and transparently recover some of the
results available in the literature on secure network coding for
wiretapped networks. We also derive new bounds on the required
secure code alphabet size and an algorithm for code construction.



I. Introduction 

Consider a communication network represented as a directed
graph $G = (V, E)$ with unit capacity edges, and an
information source S that multicasts information to t receivers
$R_1, \ldots, R_t$ located at distinct nodes. Assume that the min-cut
value between the source and each receiver node is n. We
know that a multicast rate of n is possible with linear network 
coding [1], [2]. We are here concerned with multicast networks 
in which there is an adversary that can access data on a certain 
number of links of his choice, and the goal is to maximize the 
multicast rate with the constraint of revealing no information 
about the multicast data to the adversary. 

The problem of making a linear network code information
theoretically secure in the presence of a wiretap adversary
that can look at a bounded number, say $\mu$, of network edges
was first studied by Cai and Yeung in [3]. They considered
directed graphs and demonstrated the existence of a code over
an alphabet with at least $\binom{|E|}{\mu}$ elements which can support a
secure multicast rate of up to $n - \mu$. They also showed that such
codes can be designed in $O\big(\binom{|E|}{\mu}\big)$ steps. The required edge
bandwidth and the secure code design complexity are the main
drawbacks of this pioneering work. Feldman et al. derived
trade-offs between security, code alphabet size, and multicast
rate of secure linear network coding schemes in [4], by using
ideas from secret sharing and abstracting network topology.
Another approach was taken by Jain in [5], who obtained
security by merely exploiting the topology of the network in
question. Weakly secure network coding (which ensures that
only useless information rather than none is revealed to the
adversary) was studied by Bhattad and Narayanan in [6], and
practical schemes are missing in this case as well.

A related line of work considers a more powerful adversary, 
one that can also modify the packets he observes. Modifying 



a certain number of packets in networks which only route 
information simply results in their incorrect reception, whereas 
modifying the same number of packets carrying linear combi- 
nations of source packets can have a more harmful effect since 
it can result in incorrect decoding of all source packets. Such
attacks are known in the network coding literature as Byzantine
modifications, and Byzantine modification detection in
networks implementing random network coding was studied
by Ho et al. in [7] and Jaggi et al. in [8]. The approach they
take is to introduce error correction coding at the source so 
that the packets carry not only data but also some redundant 
information derived from data which will help reduce the 
probability of incorrect decoding. 

We also find coding at the source a natural approach to 
address the information theoretic security of wiretap networks. 
In a network where the min-cut value between the source and
each receiver node is n and an adversary can access up to
$\mu$ edges of his choice, we introduce at the source a coding
scheme which ensures information theoretic security on the
Ozarow-Wyner wiretap channel type II, introduced in [9] and
[10], where the source transmits n symbols to the receiver and
an adversary can access any $\mu$ of those symbols.

Ozarow and Wyner showed that the maximum number of 
symbols (say k) that the source can communicate to the 
receiver securely in the information theoretic sense is equal to 
$n - \mu$. They also showed how to encode the k source symbols
into the n channel symbols for secure transmission. Clearly, 
if the n channel symbols are multicast over a network not 
performing coding (linear combining of the n symbols), the k 
source symbols remain secure in the presence of an adversary 
with access to any $\mu$ edges. We will illustrate later that this is
not necessarily the case when network coding is performed.
However, we will show that a network code that preserves 
security of the k source symbols (coded into the n multicast 
symbols in the Ozarow-Wyner manner) can be designed over 
a sufficiently large field. 

With the observations made by Feldman et al. in [4], it is 
easy to show that our scheme is actually equivalent to the 
one proposed in the pioneering work of Cai and Yeung in [3]. 
However, with our approach, we can quickly and transparently 
recover some of the results available in the literature on secure 
network coding for wiretapped networks. Since the publication 
of [3] in which the network code construction is based on the 
work of Li et al. in [2], a number of simpler network code 
construction algorithms have been proposed (see for example
[11], [12]). The computational complexity of network coding in
terms of the number of coding nodes and ways to minimize 



it have also been studied since then [12], [13], [14]. We will 
use these results to derive new bounds on the required secure 
code alphabet size and an algorithm for code construction. 

This paper is organized as follows: In Sec. II we briefly
review the Ozarow-Wyner wiretap channel type II problem.
In Sec. III we introduce the network generalization of this
problem. In Sec. IV we present an algorithm for secure
network code design and discuss the required code alphabet
size. In Sec. V we highlight some connections of this work
with the previous work on secure network coding and more
recent work on network error correction.

II. Wiretap Channel II 

We first consider a point-to-point scenario in which the
source can transmit n symbols to the receiver and an adversary
can access any $\mu$ of those symbols [9], [10]. For this case, we
know that the maximum number of symbols that the source
can communicate to the receiver securely in the information
theoretic sense is equal to $n - \mu$.

The problem is mathematically formulated as follows. Let
$S = (s_1, s_2, \ldots, s_k)$ be the random variable associated with
the k information symbols that the source wishes to send
securely, $Y = (y_1, y_2, \ldots, y_n)$ the random variable associated
with the symbols that are transmitted through the noiseless
channel between the source and the receiver, and
$Z = (z_1, z_2, \ldots, z_\mu)$ the random variable associated with the
wiretapped symbols of Y. When $k \le n - \mu$, there exists an encoding
scheme that maps S into Y so that the uncertainty about S
is not reduced by the knowledge of Z and S is completely
determined (decodable) by the complete knowledge of Y, that
is,

$$H(S|Z) = H(S) \quad \text{and} \quad H(S|Y) = 0. \qquad (1)$$

For $n = 2$, $k = 1$, $\mu = 1$, such a coding scheme can be
organized as follows. If the source bit equals 0, then either 00 
or 11 is transmitted through the channel with equal probability. 
Similarly, if the source bit equals 1, then either 01 or 10 is 
transmitted through the channel with equal probability. 

    source bit $s_1$:                              0           1
    codeword $y_1 y_2$ chosen at random from:      {00, 11}    {01, 10}

It is easy to see that knowledge of either $y_1$ or $y_2$ does
not reduce the uncertainty about $s_1$, whereas the knowledge
of both $y_1$ and $y_2$ is sufficient to completely determine $s_1$,
namely, $s_1 = y_1 + y_2$.

In general, $k = n - \mu$ symbols can be transmitted securely
by a coding scheme based on an $[n, n - k]$ linear MDS code
$C \subset \mathbb{F}_q^n$. In this scheme, the encoder is a probabilistic device
which operates on the space $\mathbb{F}_q^n$, where q is a large enough
prime power, partitioned into $q^k$ cosets of C. The k information
symbols are taken as the syndrome which specifies a coset, and
the transmitted word is chosen uniformly at random from the
specified coset. The decoder recovers the information symbols
by simply computing the syndrome of the received word.
Because of the properties of MDS codes, knowledge of any
$\mu = n - k$ or fewer symbols will leave the uncertainty about the k
information symbols unchanged. The code used in the above
example is the [2, 1] repetition code with the parity check matrix



$$H = [1\ \ 1]. \qquad (2)$$
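The coset coding scheme of this section can be sketched in a few lines for the binary example with $H = [1\ 1]$. This is only an illustrative sketch: the function names (`syndrome`, `coset`, `encode`, `leaked_distribution`) are hypothetical, the field is assumed to be a prime field GF(P), and cosets are enumerated by brute force, which is only practical for tiny parameters.

```python
import itertools
import random

P = 2            # field size (assumed prime); GF(2) as in the example
H = [[1, 1]]     # parity check matrix of the [2, 1] repetition code, eq. (2)
N = len(H[0])    # number of channel symbols n

def syndrome(y):
    """Syndrome H*y over GF(P); computing it is also the decoding rule."""
    return tuple(sum(h * v for h, v in zip(row, y)) % P for row in H)

def coset(s):
    """All words y in GF(P)^N whose syndrome equals s (brute force)."""
    return [y for y in itertools.product(range(P), repeat=N)
            if syndrome(y) == tuple(s)]

def encode(s, rng=random):
    """Transmit a word chosen uniformly at random from the coset of s."""
    return rng.choice(coset(s))

def leaked_distribution(s, positions):
    """Values the wiretapper may see at `positions`, over the coset of s."""
    return sorted(tuple(y[i] for i in positions) for y in coset(s))

if __name__ == "__main__":
    y = encode((1,))
    print("sent", y, "decoded", syndrome(y))
    # One tapped symbol looks the same whether the source bit is 0 or 1.
    print(leaked_distribution((0,), [0]) == leaked_distribution((1,), [0]))
```

Comparing `leaked_distribution` for the two source bits on a single position shows that one observed symbol carries no information, while observing both positions separates the two cosets.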



III. Wiretap Network II 

We now consider again an acyclic multicast network
$G = (V, E)$ with unit capacity edges, an information source, t
receivers, and the value of the mincut to each receiver equal
to n. The goal is to maximize the multicast rate with the
constraint of revealing no information about the multicast data
to the adversary that can access data on any $\mu$ links. We
assume that the adversary knows the implemented network 
code, i.e. all the coefficients of the linear combinations that 
determine the packets on each edge. Moreover, the adversary 
is aware of any shared randomness between the source and 
the destinations. The last assumption rules out the use of 
traditional "key" cryptography to achieve security. 

We know that a multicast rate of n is possible with linear 
network coding [1], [2]. It is interesting to ask whether, 
using the same network code, the source can multicast
$k \le n - \mu$ symbols securely if it first applies a secure wiretap
channel code (as described above) mapping k into n symbols. 
Naturally, this would be a solution if a multicast rate of n can 
be achieved just by routing. 

Consider this approach for the butterfly network shown in
Fig. 1, where we have $n = 2$, $k = 1$, $\mu = 1$. If the source
applies the coding scheme described in the previous section
and the usual network code as in Fig. 1-a, the adversary will
be able to immediately learn the source bit if he taps into
any of the edges BE, EF, ED. Therefore, a network code can
break down a secure wiretap channel code. However, if the
network code is changed so that node B combines its inputs
over, e.g., $\mathbb{F}_3$ and the BE coding vector is $[1\ \alpha]$, where $\alpha$ is
a primitive element of $\mathbb{F}_3$ (as in Fig. 1-b), the wiretap channel
code remains secure, that is, the adversary cannot gain any
information by accessing any single link in the network. Note
that the wiretap channel code based on the MDS code with
$H = [1\ 1]$ remains secure with any network code whose BE
coding vector is linearly independent of $[1\ 1]$.
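The butterfly example can be checked by exhaustive enumeration. The sketch below is a hypothetical helper, not part of the paper: it assumes a prime field GF(p), a single parity check row `h`, and a tapped edge with global coding vector `c`, and it declares the edge secure when the tapped symbol has the same distribution for every source symbol.

```python
from itertools import product
from collections import Counter

def observation_counts(p, h, c, s):
    """Histogram of the tapped value c.y over all y in GF(p)^2 with h.y = s."""
    return Counter(
        (c[0] * y[0] + c[1] * y[1]) % p
        for y in product(range(p), repeat=2)
        if (h[0] * y[0] + h[1] * y[1]) % p == s
    )

def edge_is_secure(p, h, c):
    """True if the tapped symbol is distributed identically for every s."""
    reference = observation_counts(p, h, c, 0)
    return all(observation_counts(p, h, c, s) == reference
               for s in range(1, p))

if __name__ == "__main__":
    # Binary butterfly: the coded edge carries y1 + y2, which equals s1.
    print(edge_is_secure(2, (1, 1), (1, 1)))
    # Over GF(3) with coding vector [1 alpha], alpha = 2 (primitive in GF(3)).
    print(edge_is_secure(3, (1, 1), (1, 2)))
```

With `h = (1, 1)`, the check fails exactly for coding vectors proportional to `[1 1]`, in line with the note closing the paragraph above.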

We will next show that the source can multicast $k \le n - \mu$
symbols securely if it first applies a secure wiretap channel
code based on an MDS code with a $k \times n$ parity check matrix
H, provided the network code is such that no linear combination of
$\mu = n - k$ or fewer coding vectors belongs to the space
spanned by the rows of H. Let $W \subset E$ denote the set
of $|W| = \mu$ edges the wiretapper chooses to observe, and
$Z_W = (z_1, z_2, \ldots, z_\mu)$ the random variable associated with
the packets carried by the edges in W. Let $C_W$ denote
the matrix whose rows are the coding vectors associated
with the observed edges in W. As in the case of the wiretap
channel, $S = (s_1, s_2, \ldots, s_k)$ denotes the random variable
associated with the k information symbols that the source
wishes to send securely, and $Y = (y_1, y_2, \ldots, y_n)$ the random
variable associated with the n wiretap channel code symbols. The
n symbols of Y will be multicast through the network by
using linear network coding. Consider $H(S, Y, Z_W)$ with the
security requirement $H(S|Z_W) = H(S)$ for all $W \subset E$:

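$$H(S|Z_W) + H(Y|S Z_W) = H(Y|Z_W) + H(S|Y Z_W),$$

where $H(S|Z_W) = H(S)$ by the security requirement and
$H(S|Y Z_W) = 0$ by decodability. Hence

$$H(Y|S Z_W) = H(Y|Z_W) - H(S)$$
$$\Rightarrow\ 0 \le n - \mathrm{rank}(C_W) - k.$$

Since there is a choice of edges such that $\mathrm{rank}(C_W) = \mu$, the
maximum rate for secure transmission is bounded as

$$k \le n - \mu.$$

If the bound is achieved with equality, we have $H(Y|S Z_W) = 0$
and consequently, the system of equations

$$\begin{bmatrix} H \\ C_W \end{bmatrix} Y = \begin{bmatrix} S \\ Z_W \end{bmatrix}$$

has to have a unique solution for all W for which $\mathrm{rank}(C_W) = \mu$. That is,

$$\mathrm{rank} \begin{bmatrix} H \\ C_W \end{bmatrix} = n \quad \text{for all } C_W \text{ s.t. } \mathrm{rank}(C_W) = \mu. \qquad (3)$$

Fig. 1. Single-edge wiretap butterfly network with a) insecure network code
and b) secure network code.
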

This analysis essentially proves the following result: 

Theorem 1: Let $G = (V, E)$ be an acyclic multicast network
with unit capacity edges, an information source and the
mincut value to each receiver equal to n. A wiretap code
at the source based on an MDS code with a $k \times n$ parity
check matrix H and a network code such that no linear
combination of $\mu = n - k$ or fewer coding vectors belongs
to the space spanned by the rows of H make the network
information theoretically secure against a wiretap adversary
who can observe at most $\mu \le n - k$ edges. Any adversary
able to observe more than $n - k$ edges will have uncertainty
about the source smaller than k.
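The condition of Theorem 1 can be tested mechanically for a small network code. The following is a minimal sketch under stated assumptions: the field is a prime field GF(p), the rows of `H` are linearly independent, and the helper names `rank_mod_p` and `satisfies_theorem1` are hypothetical. It uses the fact that the span of a set of coding vectors meets the row space of H only in the zero vector exactly when the stacked matrix has rank k plus the rank of the coding vectors.

```python
from itertools import combinations

def rank_mod_p(rows, p):
    """Rank of a matrix (list of rows) over GF(p), by Gaussian elimination."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)   # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank

def satisfies_theorem1(H, coding_vectors, mu, p):
    """Check that no nonzero combination of <= mu coding vectors lies in
    the row space of H (rows of H assumed linearly independent)."""
    k = len(H)
    for size in range(1, mu + 1):
        for W in combinations(coding_vectors, size):
            C_W = [list(v) for v in W]
            if rank_mod_p(H + C_W, p) != k + rank_mod_p(C_W, p):
                return False
    return True

if __name__ == "__main__":
    H = [[1, 1]]
    print(satisfies_theorem1(H, [[1, 0], [0, 1], [1, 2]], 1, 3))  # GF(3) code
    print(satisfies_theorem1(H, [[1, 0], [0, 1], [1, 1]], 1, 2))  # GF(2) code
```

The exhaustive loop over edge subsets grows combinatorially with the number of edges, so this is meant for checking small examples rather than for code design.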

The above analysis shows that the maximum throughput 
can be achieved by applying a wiretap channel code at the 
source and then designing the network code while respecting 
certain constraints. The decoding of secure source symbols S 
is then merely matrix multiplication of the decoded multicast 
symbols Y. The method gives us a better insight into how much
information the adversary gets if he can access more edges
than the code is designed for. It also gives us an insight into
how to simply design secure network codes in some cases over
much smaller alphabets than currently deemed necessary. Both
claims are illustrated in the example below.

IV. Network Code Design Alphabet Size 

The approach described previously in the literature for 
finding a secure multicast network code consisted of decou- 
pling the problem of designing a multicast network code and 
making it secure by using some code on top of it. Feldman 
et al. showed in [4] that there exist networks where the 
above construction might require a quite large field size. We 
investigate here a different construction that, as was hinted in 
the conclusion of [4], exploits the topology of the network. 
This is accomplished by incorporating the security constraints 
in the Linear Information Flow (LIF) algorithm of [11] that 
constructs linear multicast network codes in polynomial time 
in the number of edges in the graph. The result is a better lower 
bound on the sufficient field size. However, the modified LIF 
algorithm does not have polynomial time complexity. 

We start by giving a brief high level overview of the LIF
algorithm of [11]. The inputs of the algorithm are the network,
the source node, the t destination nodes and the number n
of packets that need to be multicast to all the destinations.
Assuming the min-cut between the source and any destination
is at least n, the algorithm outputs a linear network code that
guarantees the delivery of the n packets to all the destinations.

The algorithm starts by 1) finding t flows $F_1, F_2, \ldots, F_t$ of
value n each, from the source to each destination and 2)
setting t $n \times n$ matrices $B_{F_j}$ (one for each receiver) equal
to $I_{n \times n}$. Then, it goes over the network edges, visiting each
one in topological order. In each iteration, the algorithm finds a
suitable local encoding vector for the visited edge, and updates
the t matrices $B_{F_j}$, each formed by the global encoding vectors
of the n last visited edges in the flow $F_j$. The algorithm
maintains the invariant that the matrices $B_{F_j}$ remain invertible
after each iteration. Thus, when it terminates, each destination
will get n linear combinations of the original packets that form
a full rank system. Thus each destination can solve for these
packets by inverting the corresponding system.

An important result of the previous algorithm is that a
field of size at least t (the number of destinations) is always
sufficient for finding the desired network code. As shown in
[11, Lemma 8], this follows from the fact that a field of size
larger than or equal to t is actually sufficient for satisfying the
condition that the t matrices $B_{F_j}$ are always invertible.

We modify the LIF algorithm so it outputs a secure network
code in the following way. We fix the $k \times n$ parity check
matrix H. WLOG, we assume that the $\mu$ packets observed
by the wiretapper are linearly independent, i.e. $\mathrm{rank}\, C_W = \mu$.
We denote by $e_i$ the edge visited at the i-th iteration of
the LIF algorithm, and by $P_i$ the set of the edges that have
been processed by the end of it. Then, we extend the set of
invariants to make sure that the encoding vectors are chosen
so the matrices

$$M_W = \begin{bmatrix} H \\ C_W \end{bmatrix}$$

are also invertible, which by
Theorem 1 achieves the security condition. More precisely,
using the same techniques as the original LIF algorithm, we
make sure that by the end of the i-th iteration, the matrices $B_{F_j}$
and the matrices $M_{W_i}$ are invertible, where $W_i = \{e_i\} \cup W'$
and $W'$ is a subset of $P_i$ of order $\mu - 1 = n - k - 1$. The total
number of the matrices that need to be kept invertible in this
modified version of the LIF algorithm is at most $\binom{|E|-1}{\mu-1} + t$
(which corresponds to the last iteration). Thus, similarly as in
[11, Lemma 8], we obtain the following improved bound on
the alphabet size for secure multicast:

Theorem 2: Let $G = (V, E)$ be an acyclic network with
unit capacity edges, an information source, and the mincut
value to each of the t receivers equal to n. A secure multicast
at rate $k \le n$ in the presence of a wiretapper who can observe
at most $\mu \le n - k$ edges is always possible over the alphabet
$\mathbb{F}_q$ of size

$$q > \binom{|E|-1}{\mu-1} + t. \qquad (4)$$
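As a rough numerical illustration of the alphabet size in Theorem 2, the sketch below evaluates the number of matrices kept invertible by the modified LIF algorithm, read here as $\binom{|E|-1}{\mu-1} + t$ from the counting argument preceding the theorem, and compares it with the $\binom{|E|}{\mu}$ alphabet size attributed to [3] in the Introduction. The function names and the sample parameters are hypothetical, and for simplicity only prime (not prime power) field sizes are searched.

```python
from math import comb

def is_prime(m):
    """Trial-division primality test, adequate for small illustrative values."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True

def invariant_count(num_edges, mu, t):
    """Number of matrices kept invertible: C(|E|-1, mu-1) + t (assumed form)."""
    return comb(num_edges - 1, mu - 1) + t

def smallest_prime_above(bound):
    """Smallest prime q with q > bound (prime powers are not searched)."""
    q = bound + 1
    while not is_prime(q):
        q += 1
    return q

if __name__ == "__main__":
    num_edges, mu, t = 20, 3, 4      # hypothetical network parameters
    bound = invariant_count(num_edges, mu, t)
    print("count:", bound, "prime field size:", smallest_prime_above(bound))
    print("C(|E|, mu) for comparison:", comb(num_edges, mu))
```

For these sample values the count grows like $|E|^{\mu-1}$ rather than $|E|^{\mu}$, which is the sense in which the bound improves on the earlier one.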

Bound (4) can be further improved by realizing, as was
first done in [12], that not all edges in the network carry
different linear combinations of source symbols. Langberg et al.
showed in [13, Thm. 5] that the problem of finding multicast
network codes for a network G can be reduced to solving
the same problem for a special equivalent network $\hat{G}$ with the
same parameters n and t, which has the properties that all
nodes except the source and the destinations have total degree
3 and at most $n^3 t^2$ of its nodes have in-degree 2. These nodes
are called encoding nodes, whereas the other ones are called
forwarding nodes since the packets carried by their outgoing
edges are just copies of the ones available at their single
incoming edge. Given a network code for $\hat{G}$, one for G
can be found efficiently over the same field. Moreover, the set of
global encoding vectors of the edges of G would be a subset
of that of $\hat{G}$.

Going back to the security problem over a network G, one can
try to find a secure network code for the equivalent network
$\hat{G}$, and then use the procedure described in [13] and [14] to
construct a network code for G which will also be secure.

Now consider the problem of finding secure network codes
for $\hat{G}$. This problem will not change if the wiretapper is not
allowed to wiretap the forwarding edges. Therefore, the set of
edges that the wiretapper might have access to consists of the
encoding edges and the edges outgoing from the source, and is
of order $n^3 t^2 + \delta$, where $\delta$ is the out-degree of the source. Now,
applying Theorem 2 to $\hat{G}$ and taking into consideration the
restriction on the edges that can be potentially wiretapped, we
obtain the following bound on the sufficient field size which
is independent of the size of the network.

Corollary 1: For the transmission scenario of Thm. 2, a
secure multicast network code always exists over the alphabet
$\mathbb{F}_q$ of size

$$q > \binom{2n^3t^2}{\mu-1} + t. \qquad (5)$$

For networks with two sources, we can completely settle
the question on the required alphabet size for a secure network
code. Note that the adversary has to be limited to observing at
most one edge of his choice. Based on the work of Fragouli
and Soljanin in [12], the coding problem for these networks
is equivalent to a vertex coloring problem of some specially
designed graphs, where the colors are actually the points on
the projective line PG(1, q):

$$[0\ 1],\ [1\ 0],\ \text{and}\ [1\ \alpha^i]\ \text{for}\ 0 \le i \le q - 2, \qquad (6)$$




where $\alpha$ is a primitive element of $\mathbb{F}_q$. Clearly, any network
with two sources and an arbitrary number of receivers can be
securely coded by reducing the set of available colors in (6) by
removing point (color) [1 1] and applying a wiretap code based
on the matrix $H = [1\ 1]$ as in the example above. The alphabet
size sufficient to securely code all networks with two sources
also follows from [12]:

Theorem 3: For any configuration with two sources and t
receivers, the code alphabet $\mathbb{F}_q$ of size

$$\lfloor \sqrt{2t - 7/4} + 1/2 \rfloor + 1$$

is sufficient for a secure network code. There exist configurations
for which it is necessary.
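The expression in Theorem 3, read here as $\lfloor \sqrt{2t - 7/4} + 1/2 \rfloor + 1$, and the reduced color set can be evaluated with integer arithmetic. This is a sketch with hypothetical function names; it uses the identity $\lfloor \sqrt{2t - 7/4} + 1/2 \rfloor = \lfloor (\sqrt{8t - 7} + 1)/2 \rfloor$, and the color enumeration assumes q is prime, so that the powers of a primitive element run over all nonzero residues.

```python
from math import isqrt

def secure_alphabet_size(t):
    """floor(sqrt(2t - 7/4) + 1/2) + 1, computed exactly with integers."""
    if t < 1:
        raise ValueError("t must be a positive number of receivers")
    return (isqrt(8 * t - 7) + 1) // 2 + 1

def secure_colors(q):
    """Points of PG(1, q) for prime q, with the point [1 1] removed."""
    points = [(0, 1), (1, 0)] + [(1, x) for x in range(1, q)]
    return [pt for pt in points if pt != (1, 1)]

if __name__ == "__main__":
    for t in (1, 2, 3, 4, 7):
        print(t, secure_alphabet_size(t))
    print(secure_colors(3))
```

For t = 2 receivers the expression evaluates to 3, which agrees with the use of $\mathbb{F}_3$ in the butterfly example of Sec. III.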

The wiretap approach to network security also provides the
exact alphabet size and secure code for a class of networks
known as combination networks, illustrated in Fig. 2.
There are $\binom{M}{n}$ receiver nodes. Note that each n nodes of the
second layer are observed by a receiver. It is easy to see that
an [M + k, n] Reed-Solomon code can be used, namely, the
first k rows of its parity check matrix can be used for the coset
code and the rest as the coding vectors of the M edges going
out of the source.

Fig. 2. Combination B(n, M) network.
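One way to make the combination network construction concrete is sketched below. It rests on an interpretation that should be read as an assumption: the M + k vectors are taken as the rows of an (M + k) x n Vandermonde matrix over a prime field GF(p) with p >= M + k, so that any n of them are linearly independent; the first k rows play the role of H and the remaining M rows are the coding vectors of the source edges. All function names are hypothetical.

```python
from itertools import combinations

def rank_mod_p(rows, p):
    """Rank of a matrix (list of rows) over GF(p), by Gaussian elimination."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)   # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank

def vandermonde_split(M, k, n, p):
    """Rows (1, a, ..., a^(n-1)) for a = 0..M+k-1; first k rows form H."""
    if p < M + k:
        raise ValueError("need p >= M + k distinct evaluation points")
    rows = [[pow(a, j, p) for j in range(n)] for a in range(M + k)]
    return rows[:k], rows[k:]

def is_valid_secure_code(H, edge_vectors, n, p):
    """Every receiver (n edges) can decode Y, and every set of mu = n - k
    tapped edges stacked on H has full rank n, as in condition (3)."""
    mu = n - len(H)
    decodable = all(rank_mod_p([list(v) for v in c], p) == n
                    for c in combinations(edge_vectors, n))
    secure = all(rank_mod_p(H + [list(v) for v in c], p) == n
                 for c in combinations(edge_vectors, mu))
    return decodable and secure

if __name__ == "__main__":
    H, edges = vandermonde_split(M=4, k=1, n=2, p=5)
    print(H, edges, is_valid_secure_code(H, edges, 2, 5))
```

The validity check enumerates all receivers and all wiretap sets, so it is intended only for small B(n, M) instances.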

V. Connections with Other Schemes 

A number of connections between secure network coding
and the concurrent work on network error correction can be
observed [15], [16], [17]. We here describe the relationship
between the proposed scheme and previously known constructions.
Cai and Yeung were first to study the design of secure
network codes for multicast demands [3]. They showed that,
in the setting described above, a secure network code can be
found for any $k \le n - \mu$. Their construction is equivalent to
the following scheme:

1) Generate a vector $R = (r_1, r_2, \ldots, r_\mu)^T$ choosing its
components uniformly at random over $\mathbb{F}_q$,

2) Form vector X by concatenating the $\mu$ random symbols
R to the k source symbols S:

$$X = \begin{bmatrix} S \\ R \end{bmatrix} = (s_1, \ldots, s_k, r_1, \ldots, r_\mu)^T,$$

3) Choose an invertible $n \times n$ matrix T over $\mathbb{F}_q$ and a
linear code multicast (LCM) [2] to ensure the security
condition (1). (It is shown in [3, Thm. 1] that such LCM
and T can be found provided that $q > \binom{|E|}{\mu}$.)

4) Compute $Y = TX$ and multicast Y to all the destinations
by using the constructed code.

Feldman et al. also considered the same problem in [4].
Adopting the same approach as [3], they showed that in order
for the code to be secure, the matrix T should satisfy certain
conditions ([4, Thm. 6]), which we restate here for convenience:
In the above transmission scheme, the security condition (1)
holds if and only if any set of vectors consisting of

1) at most $\mu$ linearly independent global edge coding
vectors and/or

2) any number of vectors from the first k rows of $T^{-1}$

is linearly independent. They also showed that if one sacrifices
in the number of information packets, that is, takes $k < n - \mu$,
then one can find secure network codes over fields of size
much smaller than the very large bound $q > \binom{|E|}{\mu}$.

We will now show that our approach based on coding for 
the wiretap channel at the source is equivalent to the above 
stated scheme [3] with the conditions of [4]. 

Claim 1: Let T and C be a matrix and a corresponding
secure network code satisfying the above conditions. Set
$H = T^*$, where $T^*$ is the $k \times n$ matrix formed by taking the first k
rows of $T^{-1}$. Then H and C satisfy the condition of Thm. 1.

Proof: Consider the secure multicast scheme of [3] as
presented above. For a given information vector $S \in \mathbb{F}_q^k$, let
$B(S)$ be the set of all possible vectors $Y \in \mathbb{F}_q^n$ that could
be multicast through the network under this scheme. More
precisely,

$$B(S) = \left\{ Y \in \mathbb{F}_q^n \;\middle|\; Y = TX,\ X = \begin{bmatrix} S \\ R \end{bmatrix},\ R \in \mathbb{F}_q^{n-k} \right\}.$$

Then, for all $Y \in B(S)$, we have

$$T^* Y = T^* T \begin{bmatrix} S \\ R \end{bmatrix} = S.$$

Therefore, any $Y \in B(S)$ also belongs to the coset of the
space spanned by the rows of $T^*$ whose syndrome is equal to
S. Moreover, since T is invertible, $|B(S)| = q^{n-k}$, implying
that set $B(S)$ is exactly that coset. The conditions of [4] as
stated above directly translate into (3), the remaining condition
of Thm. 1. ■
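The identity used in the proof, $T^* Y = S$ for every choice of the random part R, can be checked numerically on a toy instance. The sketch below assumes a prime field GF(p) and uses hypothetical helper names; the matrix T is an arbitrary invertible example chosen for illustration, not one taken from [3].

```python
from itertools import product

def mat_vec(A, x, p):
    """Matrix-vector product over GF(p)."""
    return [sum(a * b for a, b in zip(row, x)) % p for row in A]

def inverse_mod_p(T, p):
    """Inverse of a square matrix over GF(p) by Gauss-Jordan elimination."""
    n = len(T)
    aug = [[v % p for v in row] + [int(i == j) for j in range(n)]
           for i, row in enumerate(T)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("T is not invertible over GF(p)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, p)   # modular inverse (Python 3.8+)
        aug[col] = [(v * inv) % p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(a - f * b) % p for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def transmitted_set(T, S, k, p):
    """B(S): all Y = T [S; R] as R ranges over GF(p)^(n-k)."""
    n = len(T)
    return {tuple(mat_vec(T, list(S) + list(R), p))
            for R in product(range(p), repeat=n - k)}

if __name__ == "__main__":
    p, k = 3, 1
    T = [[1, 2], [1, 1]]                  # illustrative invertible matrix
    T_star = inverse_mod_p(T, p)[:k]      # first k rows of T^{-1}
    for S in product(range(p), repeat=k):
        syndromes = {tuple(mat_vec(T_star, list(Y), p))
                     for Y in transmitted_set(T, S, k, p)}
        print(S, len(transmitted_set(T, S, k, p)), syndromes)
```

In this instance $T^* = [2\ 2]$, a scalar multiple of $[1\ 1]$, so the induced coset code coincides with the one used in the butterfly example.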

VI. Conclusion 

We considered the problem of securing a multicast network 
implementing network coding against a wiretapper capable of 
observing a limited number of links of his choice, as defined 
initially by Cai and Yeung. We showed that the problem can 
be formulated as a generalization of the wiretap channel of 
type II (which was introduced and studied by Ozarow and
Wyner), and decomposed into two sub-problems: the first 
one of designing a secure wiretap channel code and the 
second of designing a network code satisfying some additional 
constraints. We proved there is no penalty to pay by adopting 
this separation, which we find in many ways illuminative. 